Overview

Dataset statistics

Number of variables: 13
Number of observations: 2772
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 281.7 KiB
Average record size in memory: 104.0 B
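
The memory figures can be cross-checked with simple arithmetic. The sketch below is only a plausibility check: it assumes every value is stored in 8 bytes and that the index adds a fixed 128 bytes, neither of which is stated in the report. Under those assumptions it reproduces the total size, the average record size, and the 21.8 KiB "Memory size" listed for each variable.

```python
# Assumptions (not stated in the report): each numeric value occupies
# 8 bytes and the DataFrame index adds a fixed 128-byte overhead.
N_ROWS = 2772
N_COLS = 13
BYTES_PER_VALUE = 8
INDEX_OVERHEAD = 128


def kib(n_bytes: float) -> float:
    """Convert bytes to kibibytes."""
    return n_bytes / 1024


total_bytes = N_ROWS * N_COLS * BYTES_PER_VALUE + INDEX_OVERHEAD
column_bytes = N_ROWS * BYTES_PER_VALUE + INDEX_OVERHEAD
record_bytes = total_bytes / N_ROWS

print(f"total size:     {kib(total_bytes):.1f} KiB")
print(f"per variable:   {kib(column_bytes):.1f} KiB")
print(f"average record: {record_bytes:.1f} B")
```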

Variable types

Numeric: 13

Alerts

gross_revenue is highly correlated with qtt_invoices and 3 other fields [High correlation]
qtt_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
unique_products is highly correlated with gross_revenue and 3 other fields [High correlation]
total_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with avg_unique_basket_size [High correlation]
avg_recency is highly correlated with daily_purchase_rate [High correlation]
daily_purchase_rate is highly correlated with avg_recency [High correlation]
avg_basket_size is highly correlated with gross_revenue and 1 other field [High correlation]
avg_unique_basket_size is highly correlated with unique_products and 1 other field [High correlation]
gross_revenue is highly correlated with qtt_invoices and 1 other field [High correlation]
qtt_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
unique_products is highly correlated with qtt_invoices [High correlation]
total_products is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_basket_size is highly correlated with total_products [High correlation]
gross_revenue is highly correlated with qtt_invoices and 1 other field [High correlation]
qtt_invoices is highly correlated with gross_revenue and 1 other field [High correlation]
unique_products is highly correlated with avg_unique_basket_size [High correlation]
total_products is highly correlated with gross_revenue and 2 other fields [High correlation]
avg_recency is highly correlated with daily_purchase_rate [High correlation]
daily_purchase_rate is highly correlated with avg_recency [High correlation]
avg_basket_size is highly correlated with total_products [High correlation]
avg_unique_basket_size is highly correlated with unique_products [High correlation]
df_index is highly correlated with avg_recency [High correlation]
gross_revenue is highly correlated with qtt_invoices and 4 other fields [High correlation]
qtt_invoices is highly correlated with gross_revenue and 4 other fields [High correlation]
unique_products is highly correlated with gross_revenue and 2 other fields [High correlation]
total_products is highly correlated with gross_revenue and 4 other fields [High correlation]
avg_ticket is highly correlated with total_prod_returned [High correlation]
avg_recency is highly correlated with df_index [High correlation]
total_prod_returned is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_basket_size is highly correlated with gross_revenue and 2 other fields [High correlation]
daily_purchase_rate is highly skewed (γ1 = 46.06904053) [Skewed]
total_prod_returned is highly skewed (γ1 = 20.73295106) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
recency_days has 33 (1.2%) zeros [Zeros]
total_prod_returned has 1481 (53.4%) zeros [Zeros]
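
The same variable appears in several high-correlation alerts with different partner lists, which would be consistent with the report evaluating more than one correlation measure; the alerts themselves do not say which measure produced which line. As a rough illustration of why measures can disagree, the sketch below compares a linear (Pearson) and a rank-based (Spearman) coefficient on hypothetical toy data. The data, the `THRESHOLD` value, and the function names are assumptions for illustration, not values taken from the report or its configuration.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical toy data: y decreases monotonically, but not linearly, in x.
x = [1.0, 2.0, 4.0, 10.0, 50.0, 200.0]
y = [1 / v for v in x]

THRESHOLD = 0.9  # hypothetical cut-off for raising an alert

for name, coef in (("pearson", pearson(x, y)), ("spearman", spearman(x, y))):
    print(f"{name}: {coef:+.3f} flagged={abs(coef) >= THRESHOLD}")
```

On this toy data the rank-based coefficient reaches the cut-off while the linear one does not, so a pair can be flagged under one measure and not under another.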

Reproduction

Analysis started: 2022-08-09 21:43:38.542943
Analysis finished: 2022-08-09 21:44:27.064153
Duration: 48.52 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
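
The reported duration follows directly from the two timestamps. A minimal check, using only the values printed above:

```python
from datetime import datetime

started = datetime.fromisoformat("2022-08-09 21:43:38.542943")
finished = datetime.fromisoformat("2022-08-09 21:44:27.064153")

# Difference between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```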

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 2772
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2250.265152
Minimum: 0
Maximum: 5694
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 181.55
Q1: 899.75
median: 2060.5
Q3: 3410.5
95-th percentile: 4957.95
Maximum: 5694
Range: 5694
Interquartile range (IQR): 2510.75

Descriptive statistics

Standard deviation: 1526.176049
Coefficient of variation (CV): 0.6782205414
Kurtosis: -0.9562509049
Mean: 2250.265152
Median Absolute Deviation (MAD): 1240.5
Skewness: 0.3797379825
Sum: 6237735
Variance: 2329213.334
Monotonicity: Strictly increasing
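
Several of these statistics are derived from others in the same block. The sketch below recomputes the range, IQR, coefficient of variation, and variance for df_index from the reported minimum, maximum, quartiles, mean, and standard deviation; the same relationships hold for every variable in the report. Small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the df_index statistics in this report.
minimum, maximum = 0, 5694
q1, q3 = 899.75, 3410.5
mean, std = 2250.265152, 1526.176049

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # CV = standard deviation / mean
variance = std ** 2              # Variance = standard deviation squared

print(f"Range:    {value_range}")
print(f"IQR:      {iqr}")
print(f"CV:       {cv:.10f}")
print(f"Variance: {variance:.3f}")
```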
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 1 | < 0.1%
2909 | 1 | < 0.1%
2895 | 1 | < 0.1%
2896 | 1 | < 0.1%
2899 | 1 | < 0.1%
2900 | 1 | < 0.1%
2904 | 1 | < 0.1%
2905 | 1 | < 0.1%
2906 | 1 | < 0.1%
2907 | 1 | < 0.1%
Other values (2762) | 2762 | 99.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
5694 | 1 | < 0.1%
5684 | 1 | < 0.1%
5678 | 1 | < 0.1%
5653 | 1 | < 0.1%
5647 | 1 | < 0.1%
5636 | 1 | < 0.1%
5635 | 1 | < 0.1%
5619 | 1 | < 0.1%
5618 | 1 | < 0.1%
5609 | 1 | < 0.1%

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2772
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15285.114
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12626.55
Q1: 13814.75
median: 15240.5
Q3: 16780.5
95-th percentile: 17950.45
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2965.75

Descriptive statistics

Standard deviation: 1715.439416
Coefficient of variation (CV): 0.1122294159
Kurtosis: -1.207577693
Mean: 15285.114
Median Absolute Deviation (MAD): 1484.5
Skewness: 0.01689580368
Sum: 42370336
Variance: 2942732.389
Monotonicity: Not monotonic
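
A kurtosis near -1.2 with skewness near 0 is what an evenly spread (uniform-like) variable produces, which fits an identifier column. The sketch below computes the population excess kurtosis of a hypothetical column containing every integer in the reported ID range. The report's own estimator may apply a sample-size correction, so this is an approximation rather than a reproduction of its calculation.

```python
from statistics import fmean


def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    mu = fmean(values)
    m2 = fmean([(v - mu) ** 2 for v in values])
    m4 = fmean([(v - mu) ** 4 for v in values])
    return m4 / m2 ** 2 - 3


# Hypothetical data: every integer between the reported minimum and maximum.
evenly_spaced_ids = list(range(12347, 18288))
k = excess_kurtosis(evenly_spaced_ids)
print(f"excess kurtosis of evenly spaced IDs: {k:.4f}")
```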
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
17850 | 1 | < 0.1%
14482 | 1 | < 0.1%
17058 | 1 | < 0.1%
17704 | 1 | < 0.1%
16933 | 1 | < 0.1%
13772 | 1 | < 0.1%
16249 | 1 | < 0.1%
14198 | 1 | < 0.1%
13989 | 1 | < 0.1%
17930 | 1 | < 0.1%
Other values (2762) | 2762 | 99.6%

Minimum 10 values
Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%
12359 | 1 | < 0.1%
12360 | 1 | < 0.1%
12362 | 1 | < 0.1%
12364 | 1 | < 0.1%
12370 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18273 | 1 | < 0.1%
18272 | 1 | < 0.1%
18270 | 1 | < 0.1%
18265 | 1 | < 0.1%
18263 | 1 | < 0.1%
18261 | 1 | < 0.1%
18260 | 1 | < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2758
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2830.005032
Minimum: 36.56
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 36.56
5-th percentile: 264.539
Q1: 628.3675
median: 1169.4
Q3: 2421.64
95-th percentile: 7405.9275
Maximum: 279138.02
Range: 279101.46
Interquartile range (IQR): 1793.2725

Descriptive statistics

Standard deviation: 10438.70669
Coefficient of variation (CV): 3.688582377
Kurtosis: 376.9967288
Mean: 2830.005032
Median Absolute Deviation (MAD): 687.31
Skewness: 17.22344489
Sum: 7844773.95
Variance: 108966597.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1025.44 | 2 | 0.1%
745.06 | 2 | 0.1%
379.65 | 2 | 0.1%
598.2 | 2 | 0.1%
1078.96 | 2 | 0.1%
731.9 | 2 | 0.1%
1353.74 | 2 | 0.1%
2053.02 | 2 | 0.1%
1314.45 | 2 | 0.1%
331 | 2 | 0.1%
Other values (2748) | 2752 | 99.3%

Minimum 10 values
Value | Count | Frequency (%)
36.56 | 1 | < 0.1%
52 | 1 | < 0.1%
52.2 | 1 | < 0.1%
62.43 | 1 | < 0.1%
68.84 | 1 | < 0.1%
70.02 | 1 | < 0.1%
77.4 | 1 | < 0.1%
84.65 | 1 | < 0.1%
90.3 | 1 | < 0.1%
93.35 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
279138.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
140450.72 | 1 | < 0.1%
124564.53 | 1 | < 0.1%
117379.63 | 1 | < 0.1%
91062.38 | 1 | < 0.1%
72882.09 | 1 | < 0.1%
66653.56 | 1 | < 0.1%
65039.62 | 1 | < 0.1%

recency_days
Real number (ℝ≥0)

ZEROS

Distinct: 252
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.58297258
Minimum: 0
Maximum: 372
Zeros: 33
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 10
median: 29
Q3: 73
95-th percentile: 211
Maximum: 372
Range: 372
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 68.35191887
Coefficient of variation (CV): 1.207994486
Kurtosis: 3.450178388
Mean: 56.58297258
Median Absolute Deviation (MAD): 23
Skewness: 1.901084359
Sum: 156848
Variance: 4671.984813
Monotonicity: Not monotonic
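
The Median Absolute Deviation is the median of the absolute distances from the median, which makes it far less sensitive to the long right tail than the standard deviation (23 versus about 68 for this variable). A minimal sketch on a hypothetical seven-value sample; the sample simply reuses the quantile points listed above and is not the real column:

```python
from statistics import median, pstdev


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


# Hypothetical sample (the quantile points of recency_days, not the full data).
sample = [0, 2, 10, 29, 73, 211, 372]

mad = median_absolute_deviation(sample)
print(f"median: {median(sample)}, MAD: {mad}, std: {pstdev(sample):.1f}")
```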
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 99 | 3.6%
4 | 87 | 3.1%
3 | 85 | 3.1%
2 | 85 | 3.1%
8 | 76 | 2.7%
10 | 67 | 2.4%
9 | 66 | 2.4%
7 | 65 | 2.3%
17 | 62 | 2.2%
22 | 55 | 2.0%
Other values (242) | 2025 | 73.1%

Minimum 10 values
Value | Count | Frequency (%)
0 | 33 | 1.2%
1 | 99 | 3.6%
2 | 85 | 3.1%
3 | 85 | 3.1%
4 | 87 | 3.1%
5 | 43 | 1.6%
7 | 65 | 2.3%
8 | 76 | 2.7%
9 | 66 | 2.4%
10 | 67 | 2.4%

Maximum 10 values
Value | Count | Frequency (%)
372 | 1 | < 0.1%
366 | 1 | < 0.1%
360 | 1 | < 0.1%
358 | 3 | 0.1%
354 | 1 | < 0.1%
337 | 1 | < 0.1%
336 | 2 | 0.1%
334 | 1 | < 0.1%
333 | 2 | 0.1%
330 | 1 | < 0.1%

qtt_invoices
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 55
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.055916306
Minimum: 2
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 2
median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 204
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 9.074222455
Coefficient of variation (CV): 1.498406186
Kurtosis: 183.8507135
Mean: 6.055916306
Median Absolute Deviation (MAD): 2
Skewness: 10.62222952
Sum: 16787
Variance: 82.34151316
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
2 | 779 | 28.1%
3 | 498 | 18.0%
4 | 393 | 14.2%
5 | 237 | 8.5%
6 | 173 | 6.2%
7 | 138 | 5.0%
8 | 98 | 3.5%
9 | 69 | 2.5%
10 | 55 | 2.0%
11 | 54 | 1.9%
Other values (45) | 278 | 10.0%

Minimum 10 values
Value | Count | Frequency (%)
2 | 779 | 28.1%
3 | 498 | 18.0%
4 | 393 | 14.2%
5 | 237 | 8.5%
6 | 173 | 6.2%
7 | 138 | 5.0%
8 | 98 | 3.5%
9 | 69 | 2.5%
10 | 55 | 2.0%
11 | 54 | 1.9%

Maximum 10 values
Value | Count | Frequency (%)
206 | 1 | < 0.1%
199 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | 0.1%
86 | 1 | < 0.1%
72 | 1 | < 0.1%
62 | 2 | 0.1%
60 | 1 | < 0.1%
57 | 1 | < 0.1%
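
Each frequency is the count divided by the 2772 observations (for example, 779 / 2772 ≈ 28.1%), and the "Other values" row collects every value outside the ten most common. The sketch below shows one way such a table could be built; `frequency_table` is a hypothetical helper written for illustration, not the report's own code, and the input is toy data.

```python
from collections import Counter


def frequency_table(values, top=10):
    """Return (value, count, percent) rows for the most common values,
    followed by a single 'Other values (k)' row for the remaining k values."""
    counts = Counter(values)
    n = len(values)
    rows = [(v, c, 100 * c / n) for v, c in counts.most_common(top)]
    shown = sum(c for _, c, _ in rows)
    other_distinct = len(counts) - len(rows)
    if other_distinct:
        rows.append(
            (f"Other values ({other_distinct})", n - shown, 100 * (n - shown) / n)
        )
    return rows


# Hypothetical toy column of invoice counts.
toy = [2] * 5 + [3] * 3 + [4] * 2 + [7]
for value, count, percent in frequency_table(toy, top=2):
    print(f"{value} | {count} | {percent:.1f}%")

# Cross-check of one figure printed in the report.
print(f"{100 * 779 / 2772:.1f}%")
```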

unique_products
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 340
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.36147186
Minimum: 1
Maximum: 1786
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 29
median: 57
Q3: 105
95-th percentile: 239.45
Maximum: 1786
Range: 1785
Interquartile range (IQR): 76

Descriptive statistics

Standard deviation: 98.7487723
Coefficient of variation (CV): 1.184585278
Kurtosis: 80.60248651
Mean: 83.36147186
Median Absolute Deviation (MAD): 33
Skewness: 6.351341471
Sum: 231078
Variance: 9751.320031
Monotonicity: Not monotonic
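
The 95-th percentile of 239.45 is fractional even though every listed value of this variable is a whole number. That is consistent with linear interpolation between neighbouring sorted values: with 2772 observations the 0.95 quantile sits at position 0.95 × 2771 = 2632.45, so 45% of the gap between two adjacent values is added to the lower one. The sketch below assumes this interpolation rule; the report does not state which rule it uses.

```python
import math


def quantile_linear(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


# Position of the 0.95 quantile among 2772 sorted observations.
position = 0.95 * (2772 - 1)
print(f"position: {position:.2f}")

# Toy example: the median of four values falls halfway between 2 and 3.
print(quantile_linear([1, 2, 3, 4], 0.5))
```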
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
37 | 38 | 1.4%
24 | 37 | 1.3%
26 | 36 | 1.3%
33 | 35 | 1.3%
25 | 34 | 1.2%
28 | 34 | 1.2%
18 | 32 | 1.2%
30 | 32 | 1.2%
15 | 30 | 1.1%
31 | 29 | 1.0%
Other values (330) | 2435 | 87.8%

Minimum 10 values
Value | Count | Frequency (%)
1 | 19 | 0.7%
2 | 13 | 0.5%
3 | 17 | 0.6%
4 | 18 | 0.6%
5 | 22 | 0.8%
6 | 19 | 0.7%
7 | 21 | 0.8%
8 | 24 | 0.9%
9 | 23 | 0.8%
10 | 20 | 0.7%

Maximum 10 values
Value | Count | Frequency (%)
1786 | 1 | < 0.1%
1766 | 1 | < 0.1%
1322 | 1 | < 0.1%
1118 | 1 | < 0.1%
884 | 1 | < 0.1%
817 | 1 | < 0.1%
717 | 1 | < 0.1%
714 | 1 | < 0.1%
699 | 1 | < 0.1%
636 | 1 | < 0.1%

total_products
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1637
Distinct (%): 59.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1665.883478
Minimum: 2
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 119.55
Q1: 330
median: 703.5
Q3: 1478
95-th percentile: 4568.1
Maximum: 196844
Range: 196842
Interquartile range (IQR): 1148

Descriptive statistics

Standard deviation: 5883.560113
Coefficient of variation (CV): 3.531795706
Kurtosis: 488.0670946
Mean: 1665.883478
Median Absolute Deviation (MAD): 452
Skewness: 18.24506239
Sum: 4617829
Variance: 34616279.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
310 | 11 | 0.4%
246 | 8 | 0.3%
150 | 8 | 0.3%
219 | 7 | 0.3%
200 | 7 | 0.3%
300 | 7 | 0.3%
493 | 7 | 0.3%
1200 | 7 | 0.3%
272 | 7 | 0.3%
260 | 7 | 0.3%
Other values (1627) | 2696 | 97.3%

Minimum 10 values
Value | Count | Frequency (%)
2 | 1 | < 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
25 | 1 | < 0.1%
27 | 2 | 0.1%
30 | 1 | < 0.1%
32 | 1 | < 0.1%
33 | 2 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
196844 | 1 | < 0.1%
80263 | 1 | < 0.1%
77373 | 1 | < 0.1%
69993 | 1 | < 0.1%
64549 | 1 | < 0.1%
64124 | 1 | < 0.1%
63312 | 1 | < 0.1%
58343 | 1 | < 0.1%
57885 | 1 | < 0.1%
50255 | 1 | < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2770
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.50911923
Minimum: 2.150588235
Maximum: 1687.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.852205466
Q1: 12.41296977
median: 17.93606578
Q3: 25.00599826
95-th percentile: 86.4856545
Maximum: 1687.2
Range: 1685.049412
Interquartile range (IQR): 12.59302849

Descriptive statistics

Standard deviation: 67.31774107
Coefficient of variation (CV): 2.206479333
Kurtosis: 182.3411047
Mean: 30.50911923
Median Absolute Deviation (MAD): 6.32595137
Skewness: 10.92302982
Sum: 84571.27852
Variance: 4531.678263
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
14.47833333 | 2 | 0.1%
4.162 | 2 | 0.1%
6.269700855 | 1 | < 0.1%
32.59775 | 1 | < 0.1%
19.03048387 | 1 | < 0.1%
28.55451613 | 1 | < 0.1%
12.80068182 | 1 | < 0.1%
6.396214689 | 1 | < 0.1%
26.08797101 | 1 | < 0.1%
17.98461538 | 1 | < 0.1%
Other values (2760) | 2760 | 99.6%

Minimum 10 values
Value | Count | Frequency (%)
2.150588235 | 1 | < 0.1%
2.4325 | 1 | < 0.1%
2.462371134 | 1 | < 0.1%
2.511241379 | 1 | < 0.1%
2.515333333 | 1 | < 0.1%
2.65 | 1 | < 0.1%
2.656931818 | 1 | < 0.1%
2.707598253 | 1 | < 0.1%
2.760621572 | 1 | < 0.1%
2.770464191 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1687.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%
872.13 | 1 | < 0.1%
841.0214493 | 1 | < 0.1%
651.1683333 | 1 | < 0.1%
640 | 1 | < 0.1%
624.4 | 1 | < 0.1%
615.75 | 1 | < 0.1%
602.4531323 | 1 | < 0.1%
591.7066667 | 1 | < 0.1%

avg_recency
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1155
Distinct (%): 41.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.74240252
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 13
Q1: 34.125
median: 59
Q3: 99
95-th percentile: 224
Maximum: 366
Range: 365
Interquartile range (IQR): 64.875

Descriptive statistics

Standard deviation: 66.49989571
Coefficient of variation (CV): 0.8445245964
Kurtosis: 3.687351115
Mean: 78.74240252
Median Absolute Deviation (MAD): 30
Skewness: 1.830983679
Sum: 218273.9398
Variance: 4422.23613
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
70 | 21 | 0.8%
46 | 18 | 0.6%
55 | 17 | 0.6%
49 | 16 | 0.6%
31 | 16 | 0.6%
91 | 16 | 0.6%
21 | 15 | 0.5%
35 | 15 | 0.5%
42 | 15 | 0.5%
28 | 14 | 0.5%
Other values (1145) | 2609 | 94.1%

Minimum 10 values
Value | Count | Frequency (%)
1 | 9 | 0.3%
2 | 4 | 0.1%
2.861538462 | 1 | < 0.1%
3 | 6 | 0.2%
3.330357143 | 1 | < 0.1%
3.351351351 | 1 | < 0.1%
4 | 5 | 0.2%
4.191011236 | 1 | < 0.1%
4.275862069 | 1 | < 0.1%
4.5 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
366 | 1 | < 0.1%
365 | 1 | < 0.1%
364 | 1 | < 0.1%
363 | 1 | < 0.1%
357 | 2 | 0.1%
356 | 1 | < 0.1%
355 | 2 | 0.1%
352 | 1 | < 0.1%
351 | 2 | 0.1%
350 | 3 | 0.1%

daily_purchase_rate
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1225
Distinct (%): 44.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.04972001241
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008746355685
Q1: 0.01577911314
median: 0.0243902439
Q3: 0.04166666667
95-th percentile: 0.1153846154
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.02588755353

Descriptive statistics

Standard deviation: 0.3377158337
Coefficient of variation (CV): 6.792352161
Kurtosis: 2294.878341
Mean: 0.04972001241
Median Absolute Deviation (MAD): 0.01069454458
Skewness: 46.06904053
Sum: 137.8238744
Variance: 0.1140519843
Monotonicity: Not monotonic
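
A skewness of about 46 means a handful of very large values dominate an otherwise tightly packed distribution: the 95-th percentile is roughly 0.115 while the maximum is 17. In this report only the two variables with skewness above 20 carry the SKEWED tag, while total_products at 18.2 does not. The sketch below computes a bias-adjusted sample skewness on hypothetical data to show how a single outlier drives the statistic; the exact estimator used by the report is an assumption here.

```python
from statistics import fmean


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness: g1 * sqrt(n * (n - 1)) / (n - 2),
    where g1 = m3 / m2**1.5 is computed from population central moments."""
    n = len(values)
    mu = fmean(values)
    m2 = fmean([(v - mu) ** 2 for v in values])
    m3 = fmean([(v - mu) ** 3 for v in values])
    g1 = m3 / m2 ** 1.5
    return (n * (n - 1)) ** 0.5 / (n - 2) * g1


# Hypothetical data, not taken from the report.
symmetric = [1, 2, 3, 4, 5]
one_outlier = [0.02] * 99 + [17]

print(f"symmetric:   {sample_skewness(symmetric):.3f}")
print(f"one outlier: {sample_skewness(one_outlier):.3f}")
```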
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0.0625 | 18 | 0.6%
0.02777777778 | 17 | 0.6%
0.02380952381 | 16 | 0.6%
0.08333333333 | 15 | 0.5%
0.09090909091 | 15 | 0.5%
0.02941176471 | 14 | 0.5%
0.03448275862 | 14 | 0.5%
0.02564102564 | 13 | 0.5%
0.07692307692 | 13 | 0.5%
0.02127659574 | 13 | 0.5%
Other values (1215) | 2624 | 94.7%

Minimum 10 values
Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005479452055 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%
0.005602240896 | 1 | < 0.1%
0.005617977528 | 2 | 0.1%
0.00566572238 | 1 | < 0.1%
0.005681818182 | 2 | 0.1%
0.005698005698 | 3 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
17 | 1 | < 0.1%
3 | 1 | < 0.1%
2 | 1 | < 0.1%
1.142857143 | 1 | < 0.1%
1 | 8 | 0.3%
0.75 | 1 | < 0.1%
0.6666666667 | 3 | 0.1%
0.550802139 | 1 | < 0.1%
0.5335120643 | 1 | < 0.1%
0.5 | 3 | 0.1%

total_prod_returned
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 203
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.73448773
Minimum: 0
Maximum: 8004
Zeros: 1481
Zeros (%): 53.4%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 9
95-th percentile: 95.45
Maximum: 8004
Range: 8004
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 235.4560695
Coefficient of variation (CV): 7.419564212
Kurtosis: 569.4476922
Mean: 31.73448773
Median Absolute Deviation (MAD): 0
Skewness: 20.73295106
Sum: 87968
Variance: 55439.56066
Monotonicity: Not monotonic
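
Because more than half of the observations are zero (1481 of 2772), the median is necessarily 0, and so is the Median Absolute Deviation, since more than half of the absolute deviations from that median are also 0. The sketch below illustrates this with a hypothetical column that has the same zero share; the non-zero values are invented placeholders.

```python
from statistics import median

n_rows, n_zeros = 2772, 1481
zero_share = 100 * n_zeros / n_rows
print(f"zeros: {zero_share:.1f}%")

# Hypothetical column: the reported number of zeros plus arbitrary
# positive values filling the remaining rows.
column = [0] * n_zeros + list(range(1, n_rows - n_zeros + 1))

center = median(column)
mad = median([abs(v - center) for v in column])
print(f"median: {center}, MAD: {mad}")
```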
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
0 | 1481 | 53.4%
1 | 129 | 4.7%
2 | 117 | 4.2%
3 | 82 | 3.0%
4 | 72 | 2.6%
6 | 63 | 2.3%
5 | 55 | 2.0%
12 | 45 | 1.6%
8 | 39 | 1.4%
9 | 38 | 1.4%
Other values (193) | 651 | 23.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1481 | 53.4%
1 | 129 | 4.7%
2 | 117 | 4.2%
3 | 82 | 3.0%
4 | 72 | 2.6%
5 | 55 | 2.0%
6 | 63 | 2.3%
7 | 38 | 1.4%
8 | 39 | 1.4%
9 | 38 | 1.4%

Maximum 10 values
Value | Count | Frequency (%)
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%
3768 | 1 | < 0.1%
3332 | 1 | < 0.1%
2878 | 1 | < 0.1%
2022 | 1 | < 0.1%
2012 | 1 | < 0.1%
1776 | 1 | < 0.1%
1594 | 1 | < 0.1%
1535 | 2 | 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1936
Distinct (%): 69.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 229.3617361
Minimum: 1
Maximum: 3868.65
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45
Q1: 103.325
median: 172
Q3: 278.1214286
95-th percentile: 583.675
Maximum: 3868.65
Range: 3867.65
Interquartile range (IQR): 174.7964286

Descriptive statistics

Standard deviation: 237.6557625
Coefficient of variation (CV): 1.036161335
Kurtosis: 45.20826542
Mean: 229.3617361
Median Absolute Deviation (MAD): 81
Skewness: 5.146920468
Sum: 635790.7323
Variance: 56480.26147
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
100 | 11 | 0.4%
86 | 9 | 0.3%
60 | 8 | 0.3%
75 | 8 | 0.3%
136 | 7 | 0.3%
82 | 7 | 0.3%
105 | 7 | 0.3%
73 | 7 | 0.3%
197 | 7 | 0.3%
208 | 7 | 0.3%
Other values (1926) | 2694 | 97.2%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
3.333333333 | 1 | < 0.1%
5.333333333 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%
6.142857143 | 1 | < 0.1%
7.5 | 1 | < 0.1%
9 | 1 | < 0.1%
9.5 | 1 | < 0.1%
11 | 1 | < 0.1%
11.875 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
3868.65 | 1 | < 0.1%
2880 | 1 | < 0.1%
2733.944444 | 1 | < 0.1%
2518.769231 | 1 | < 0.1%
2160.333333 | 1 | < 0.1%
2082.225806 | 1 | < 0.1%
2000 | 1 | < 0.1%
1903.5 | 1 | < 0.1%
1866.933333 | 1 | < 0.1%
1858 | 1 | < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 897
Distinct (%): 32.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.14667573
Minimum: 0.2
Maximum: 177
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0.2
5-th percentile: 2
Q1: 7.545454545
median: 13.5
Q3: 22
95-th percentile: 45.1125
Maximum: 177
Range: 176.8
Interquartile range (IQR): 14.45454545

Descriptive statistics

Standard deviation: 14.2623174
Coefficient of variation (CV): 0.8317832347
Kurtosis: 10.01065379
Mean: 17.14667573
Median Absolute Deviation (MAD): 6.666666667
Skewness: 2.246708129
Sum: 47530.58512
Variance: 203.4136977
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
8 | 34 | 1.2%
13 | 33 | 1.2%
9 | 32 | 1.2%
16 | 32 | 1.2%
7 | 32 | 1.2%
12 | 30 | 1.1%
14 | 29 | 1.0%
6 | 29 | 1.0%
17 | 29 | 1.0%
18.5 | 29 | 1.0%
Other values (887) | 2463 | 88.9%

Minimum 10 values
Value | Count | Frequency (%)
0.2 | 1 | < 0.1%
0.25 | 3 | 0.1%
0.3333333333 | 6 | 0.2%
0.4 | 1 | < 0.1%
0.4090909091 | 1 | < 0.1%
0.5 | 12 | 0.4%
0.5454545455 | 1 | < 0.1%
0.5555555556 | 1 | < 0.1%
0.5714285714 | 1 | < 0.1%
0.6176470588 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
177 | 1 | < 0.1%
105 | 1 | < 0.1%
104 | 1 | < 0.1%
98 | 1 | < 0.1%
95.5 | 1 | < 0.1%
94.33333333 | 1 | < 0.1%
93.33333333 | 1 | < 0.1%
89.625 | 1 | < 0.1%
87 | 1 | < 0.1%
85.66666667 | 1 | < 0.1%

Interactions

[Figures: pairwise interaction plots between the variables above (Matplotlib v3.5.1); images not included]
2022-08-09T18:44:09.755365image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:12.810929image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:15.971599image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:19.401563image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:22.676024image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:26.036655image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:43:47.422634image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:43:50.750564image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:43:53.812609image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:43:56.923988image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:00.079906image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:03.324702image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:06.789741image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:10.002479image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:13.057976image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:16.240570image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:19.681348image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/
2022-08-09T18:44:22.918951image/svg+xmlMatplotlib v3.5.1, https://matplotlib.org/

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
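
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs: the helper names `average_ranks` and `spearman_rho` are hypothetical, and giving tied values the average of their ranks is an assumed convention. The input values are copied from the first five sample rows of this report.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their std devs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in rx) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in ry) / n)
    return cov / (std_x * std_y)


# gross_revenue and total_products for the first five sample rows.
gross_revenue = [5391.21, 3232.59, 6705.38, 948.25, 876.00]
total_products = [1733.0, 1390.0, 5028.0, 439.0, 80.0]
rho = spearman_rho(gross_revenue, total_products)
print(rho)
```

On these five rows the two columns have identical rank orderings, so ρ comes out as 1 up to floating-point rounding. That is a property of this tiny excerpt only, not the value computed over all 2772 observations.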

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
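
As a rough illustration of that definition, the sketch below computes r with the standard library only. The function name `pearson_r` is hypothetical, and population (divide-by-n) moments are an assumed choice; the normalisation cancels in the ratio, so sample moments give the same r. The inputs are copied from the first five sample rows of this report.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std devs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    # A constant column has zero standard deviation; r is undefined there
    # and this division raises ZeroDivisionError.
    return cov / (std_x * std_y)


# qtt_invoices and total_products for the first five sample rows.
qtt_invoices = [34.0, 9.0, 15.0, 5.0, 3.0]
total_products = [1733.0, 1390.0, 5028.0, 439.0, 80.0]

r = pearson_r(qtt_invoices, total_products)
# Shifting and rescaling one variable by a positive factor leaves r unchanged.
r_rescaled = pearson_r([2 * v + 10 for v in qtt_invoices], total_products)
print(r, r_rescaled)
```

The two printed values agree up to floating-point rounding, which is the location and scale invariance mentioned above.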

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
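
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements it directly; the name `kendall_tau_a` is hypothetical, and counting tied pairs as neither concordant nor discordant is an assumption. Library implementations commonly use tie-adjusted variants such as tau-b, so their output can differ when ties are present. The inputs are copied from the first five sample rows of this report.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
        # sign == 0 is a tie in x or y and counts toward neither total
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# recency_days and qtt_invoices for the first five sample rows.
recency_days = [372.0, 56.0, 2.0, 95.0, 333.0]
qtt_invoices = [34.0, 9.0, 15.0, 5.0, 3.0]
tau = kendall_tau_a(recency_days, qtt_invoices)
print(tau)  # 4 concordant and 6 discordant pairs out of 10 -> -0.2
```

The brute-force loop visits every pair, which is fine for a handful of rows but quadratic in the number of observations.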

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

row | df_index | customer_id | gross_revenue | recency_days | qtt_invoices | unique_products | total_products | avg_ticket | avg_recency | daily_purchase_rate | total_prod_returned | avg_basket_size | avg_unique_basket_size
0 | 0 | 17850 | 5391.21 | 372.0 | 34.0 | 21.0 | 1733.0 | 18.152222 | 1.000000 | 17.000000 | 40.0 | 50.970588 | 0.617647
1 | 1 | 13047 | 3232.59 | 56.0 | 9.0 | 105.0 | 1390.0 | 18.904035 | 52.833333 | 0.028302 | 35.0 | 154.444444 | 11.666667
2 | 2 | 12583 | 6705.38 | 2.0 | 15.0 | 114.0 | 5028.0 | 28.902500 | 26.500000 | 0.040323 | 50.0 | 335.200000 | 7.600000
3 | 3 | 13748 | 948.25 | 95.0 | 5.0 | 24.0 | 439.0 | 33.866071 | 92.666667 | 0.017921 | 0.0 | 87.800000 | 4.800000
4 | 4 | 15100 | 876.00 | 333.0 | 3.0 | 1.0 | 80.0 | 292.000000 | 20.000000 | 0.073171 | 22.0 | 26.666667 | 0.333333
5 | 5 | 15291 | 4623.30 | 25.0 | 14.0 | 61.0 | 2102.0 | 45.326471 | 26.769231 | 0.040115 | 29.0 | 150.142857 | 4.357143
6 | 6 | 14688 | 5630.87 | 7.0 | 21.0 | 148.0 | 3621.0 | 17.219786 | 19.263158 | 0.057221 | 399.0 | 172.428571 | 7.047619
7 | 7 | 17809 | 5411.91 | 16.0 | 12.0 | 46.0 | 2057.0 | 88.719836 | 39.666667 | 0.033520 | 41.0 | 171.416667 | 3.833333
8 | 8 | 15311 | 60767.90 | 0.0 | 91.0 | 567.0 | 38194.0 | 25.543464 | 4.191011 | 0.243316 | 474.0 | 419.714286 | 6.230769
9 | 9 | 16098 | 2005.63 | 87.0 | 7.0 | 34.0 | 613.0 | 29.934776 | 47.666667 | 0.024390 | 0.0 | 87.571429 | 4.857143

Last rows

row | df_index | customer_id | gross_revenue | recency_days | qtt_invoices | unique_products | total_products | avg_ticket | avg_recency | daily_purchase_rate | total_prod_returned | avg_basket_size | avg_unique_basket_size
2762 | 5609 | 17290 | 525.24 | 3.0 | 2.0 | 92.0 | 404.0 | 5.149412 | 13.0 | 0.142857 | 0.0 | 202.000000 | 46.000000
2763 | 5618 | 14785 | 77.40 | 10.0 | 2.0 | 2.0 | 84.0 | 25.800000 | 5.0 | 0.333333 | 0.0 | 42.000000 | 1.000000
2764 | 5619 | 17254 | 272.44 | 4.0 | 2.0 | 100.0 | 252.0 | 2.432500 | 11.0 | 0.166667 | 0.0 | 126.000000 | 50.000000
2765 | 5635 | 17232 | 421.52 | 2.0 | 2.0 | 30.0 | 203.0 | 11.708889 | 12.0 | 0.153846 | 0.0 | 101.500000 | 15.000000
2766 | 5636 | 17468 | 137.00 | 10.0 | 2.0 | 5.0 | 116.0 | 27.400000 | 4.0 | 0.400000 | 0.0 | 58.000000 | 2.500000
2767 | 5647 | 13596 | 697.04 | 5.0 | 2.0 | 133.0 | 406.0 | 4.199036 | 7.0 | 0.250000 | 0.0 | 203.000000 | 66.500000
2768 | 5653 | 14893 | 1237.85 | 9.0 | 2.0 | 72.0 | 799.0 | 16.956849 | 2.0 | 0.666667 | 0.0 | 399.500000 | 36.000000
2769 | 5678 | 14126 | 706.13 | 7.0 | 3.0 | 14.0 | 508.0 | 47.075333 | 3.0 | 0.750000 | 50.0 | 169.333333 | 4.666667
2770 | 5684 | 13521 | 1092.39 | 1.0 | 3.0 | 312.0 | 733.0 | 2.511241 | 4.5 | 0.300000 | 0.0 | 244.333333 | 104.000000
2771 | 5694 | 15060 | 301.84 | 8.0 | 4.0 | 80.0 | 262.0 | 2.515333 | 1.0 | 2.000000 | 0.0 | 65.500000 | 20.000000